Self-supervised Log Parsing
Authors
Abstract
Logs are extensively used during the development and maintenance of software systems. They collect runtime events and allow tracking of code execution, which enables a variety of critical tasks such as troubleshooting and fault detection. However, large-scale systems generate massive volumes of semi-structured log records, posing a major challenge for automated analysis. Parsing records with free-form text messages into structured templates is the first and crucial step that enables further analysis. Existing approaches rely on log-specific heuristics or manual rule extraction. These are often specialized in parsing certain log types and are thus limited in performance and generalization. We propose a novel technique called NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling (MLM). In the process of parsing, the model extracts summarizations from the logs in the form of a vector embedding. This allows coupling the MLM pre-training with a downstream anomaly detection task. We evaluate the approach on 10 real-world log datasets and compare the results with 12 existing parsing techniques. The results show that NuLog outperforms existing methods with an average parsing accuracy of 99% and achieves the lowest edit distance to the ground truth templates. Additionally, two case studies are conducted to demonstrate the ability of the approach to support log-based anomaly detection in both a supervised and an unsupervised scenario. The results show that NuLog can be successfully used to support troubleshooting tasks. The implementation is available at https://github.com/nulog/nulog.
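The abstract describes the core mechanism of formulating log parsing as masked language modeling: each token of a log message is masked in turn and a model trained on the logs tries to recover it, so tokens recovered with high confidence can be treated as constant template parts while the rest become variable parameters. The sketch below illustrates this idea; it is a minimal illustration, not the authors' implementation, and the class names, the "<*>" placeholder, the Transformer hyperparameters, and the confidence threshold are assumptions chosen for brevity.

    import torch
    import torch.nn as nn

    # Minimal sketch of parsing-as-MLM (illustrative; not the NuLog implementation).
    class MaskedLogModel(nn.Module):
        """Tiny Transformer encoder trained to recover masked log tokens."""
        def __init__(self, vocab_size, d_model=64, nhead=4, num_layers=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers)
            self.out = nn.Linear(d_model, vocab_size)

        def forward(self, token_ids):
            # token_ids: (batch, seq_len) integer token indices
            hidden = self.encoder(self.embed(token_ids))
            return self.out(hidden)  # per-position logits over the vocabulary

    def extract_template(model, token_ids, mask_id, id2tok, threshold=0.5):
        """Mask each position in turn; tokens the model recovers with high
        confidence become constant template parts, the rest become '<*>'."""
        template = []
        for pos in range(token_ids.size(1)):
            masked = token_ids.clone()
            masked[0, pos] = mask_id
            with torch.no_grad():
                probs = torch.softmax(model(masked)[0, pos], dim=-1)
            original = token_ids[0, pos].item()
            template.append(id2tok[original] if probs[original] >= threshold else "<*>")
        return " ".join(template)

In practice such a model would first be trained with the same masked-token objective over the whole log dataset; the per-message summarization vector mentioned in the abstract could then be obtained by pooling the encoder's hidden states and reused as input to a downstream anomaly detection model.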
Similar resources
Simple Semi-supervised Dependency Parsing
We present a simple and effective semisupervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of our approach in a series of dependency parsing experiments on the Penn Treebank, and we show that our cluster-based features yiel...
Simple Semi-supervised Dependency Parsing
We present a simple and effective semisupervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that...
Weakly supervised parsing with rules
This work proposes a new research direction to address the lack of structures in traditional n-gram models. It is based on a weakly supervised dependency parser that can model speech syntax without relying on any annotated training corpus. Labeled data is replaced by a few hand-crafted rules that encode basic syntactic knowledge. Bayesian inference then samples the rules, disambiguating and com...
Semi-Supervised Feature Transformation for Dependency Parsing
In current dependency parsing models, conventional features (i.e. base features) defined over surface words and part-of-speech tags in a relatively high-dimensional feature space may suffer from the data sparseness problem and thus exhibit less discriminative power on unseen data. In this paper, we propose a novel semi-supervised approach to addressing the problem by transforming the base featu...
Improved CCG Parsing with Semi-supervised Supertagging
Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal. We show how a state-of-the-art CCG parser can be enhanced by predicting lexical categories using unsupervised vector-space embeddings of words. The use of word embeddings enables our model to better generalize from the labelled data, and allows us to ...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-67667-4_8